Balancing Assistance and Integrity
Reflective Piece (Literature Review)

What?

This assignment required a standalone literature review that demonstrated a current understanding of the topic, use of relevant sources, and critical comparison of findings and methodologies. I chose the topic of Large Language Models in academic writing and learning support because it felt genuinely tied to the future of how academic work will be produced and assessed. On a personal level, it also challenged a bias I did not realise I had. I had long dismissed chatbots as a weaker substitute for real human support; then I watched the leap in capability over the last couple of years, including how strong even free-tier models have become, and I realised I needed to move from opinion to evidence.

In my submission, I reviewed fifteen peer-reviewed sources from 2023 to 2025, with earlier work used selectively to anchor longer-standing concepts such as plagiarism and academic integrity. I structured the review around three connected themes: LLMs as learning support; integrity risks and AI-giarism; and governance, literacy, and equity. The goal was not simply to summarise studies, but to show where researchers converge, where they disagree, and what those disagreements imply for practice.

So what?

The most important shift for me came from seeing how fragile the boundary is between “support” and “substitution”. Several studies suggest LLMs can improve surface features of writing and deliver fast, rubric-aligned feedback at scale, which is especially relevant for students working in an additional language (Xu and Zhou, 2023; Tai et al., 2024; Fan et al., 2024; Dai et al., 2023). However, the same strengths that make these tools helpful also create integrity pressure: fluent text can hide the student’s actual contribution, and AI-generated work can bypass traditional similarity checking (Bittle and El-Gayar, 2025; Susnjak and McIntosh, 2024). Writing the review forced me to stop treating integrity as a detection problem and start treating it as an authorship and accountability problem.

That reframing connected directly to my role in an aviation college exams department. Our environment is high volume and high stakes: thousands of exam records, multiple attempts, trend reporting across cohorts and semesters, and decisions that can affect how a faculty is perceived and which interventions are triggered. The literature made me think about how quickly institutions reach for tool-based “certainty” when they feel threatened, whether that is plagiarism detection in universities or a single headline metric in performance reporting. In both cases, the temptation is the same: simplify something complex until it feels controllable. The problem is that simplification can quietly become unfairness. Research highlighting false positives, false negatives, and disproportionate impacts matters because once trust is damaged, whether in academic integrity processes or assessment reporting, it is hard to recover (Pagaling et al., 2024; Chan, 2025).

The feedback I received from my professor (76/100) confirmed that my strengths were the currency of my sources, critical comparison, and the linking of integrity, pedagogy, and policy. At the same time, the comment that some sections were dense and occasionally redundant landed for a reason. I think the density came from trying to honour the complexity of the debate without narrowing the lens enough. In hindsight, I can see that critical writing is not only about including more evidence; it is also about choosing fewer, stronger pieces of evidence and guiding the reader through the logic with clearer signposts. That is a useful lesson for me professionally too, because when I produce reports for faculty, clarity often matters more than completeness, and a busy reader will miss the point if the narrative is overloaded.

Now what?

In future literature reviews, I will keep the same critical approach but tighten the presentation. Practically, that means being more ruthless about cutting redundancy, using clearer signposting to show how sections connect, and concentrating on a smaller set of the most persuasive findings. It also means being explicit about what different study designs can and cannot conclude, especially around long-term outcomes such as independence, originality, and overreliance, which systematic reviews suggest remain under-evidenced compared with short-term fluency gains (Almoraie and Alhejaili, 2025).

More broadly, this topic has changed how I think about the future of writing and assessment. The literature does not support a simple ban-versus-acceptance debate. The more defensible direction is structured integration: clearer disclosure norms, explicit AI literacy, and assessment approaches that value process and understanding, not just polished outputs (Kasneci et al., 2023; Ramírez-Montoya and Lugo-Ocando, 2024; Zawacki-Richter and Marín, 2024). In my own workplace reporting, I want to apply the same principle: design for accountability and fairness through transparent definitions, documented choices, and reporting structures that make complexity visible without becoming unreadable.

References

  • Almoraie, A. and Alhejaili, M. (2025) ‘A systematic literature review to implement large language model in education’, Education and Information Technologies (online). Available at: https://link.springer.com/article/10.1007/s44217-025-00424-7 (Accessed 29 November 2025).
  • Bittle, K. and El-Gayar, O. (2025) ‘Generative AI and academic integrity in higher education: A systematic review and research agenda’, Information, 16(4), 296. doi:10.3390/info16040296 (Accessed 6 December 2025).
  • Bretag, T. and Mahmud, S. (2016) ‘A conceptual framework for implementing exemplary academic integrity policy in Australian higher education’, in Bretag, T. (ed.) Handbook of Academic Integrity. Singapore: Springer, pp. 463–480. doi:10.1007/978-981-287-098-8_24.
  • Chan, C.K.Y. (2025) ‘Students’ perceptions of “AI-giarism”: investigating changes in understandings of academic misconduct’, Education and Information Technologies, 30, pp. 8087–8108. Available at: https://link.springer.com/article/10.1007/s10639-024-13151-7 (Accessed 6 December 2025).
  • Dai, H. et al. (2023) ‘AI-generated feedback on writing: insights into efficacy and ENL student writing’, International Journal of Educational Technology in Higher Education, 20(1), p. 58. Available at: https://educationaltechnologyjournal.springeropen.com/articles/10.1186/s41239-023-00425-2 (Accessed 25 November 2025).
  • Fan, X. et al. (2024) ‘Comparing the quality of human and ChatGPT feedback of students’ writing’, Journal of Writing Research (online). Available at: https://www.sciencedirect.com/science/article/pii/S0959475224000215 (Accessed 27 November 2025).
  • Howard, R.M. (1995) ‘Plagiarisms, authorships, and the academic death penalty’, College English, 57(7), pp. 788–806.
  • Kasneci, E. et al. (2023) ‘ChatGPT for good? On opportunities and challenges of large language models for education’, Learning and Individual Differences, 104, 102274. Available at: https://www.sciencedirect.com/science/article/pii/S1041608023000195 (Accessed 24 November 2025).
  • Lund, B., Mannuru, N.R., Teel, Z.A., Lee, T.H. and Ortega, N.J. (2026) ‘Student perceptions of AI assisted writing and academic integrity, ethical concerns, academic misconduct, and use of generative AI in higher education’, AI in Education, 1(1), 2. Available at: https://www.mdpi.com/3042-8130/1/1/2 (Accessed 6 December 2025).
  • Pagaling, H., Cochrane, A. and Philp, J. (2024) ‘Academic integrity or academic misconduct? Conceptual and practical tensions in higher education’, Higher Education Research & Development, 43(8), pp. 1564–1584. Available at: https://doi.org/10.1080/07294360.2024.2339833 (Accessed 6 December 2025).
  • Ramírez-Montoya, M.S. and Lugo-Ocando, D. (2024) ‘The impact of large language models on higher education’, Frontiers in Education, 9, 1392091. Available at: https://www.frontiersin.org/articles/10.3389/feduc.2024.1392091 (Accessed 28 November 2025).
  • Scholars (2024) ‘Student perceptions of ChatGPT: benefits, costs, and attitudinal differences between users and non-users toward AI integration in higher education’, Education and Information Technologies (online). Available at: https://link.springer.com/article/10.1007/s10639-025-13575-9 (Accessed 23 November 2025).
  • Shen, S. (2023) ‘Application of large language models in the field of education’, International Journal of Emerging Technologies in Learning, 18(4), pp. 24–35. Available at: https://www.researchgate.net/publication/380177498_Application_of_large_language_models_in_the_field_of_education (Accessed 25 November 2025).
  • Susnjak, T. and McIntosh, T.R. (2024) ‘ChatGPT: The end of online exam integrity?’, Education Sciences, 14(6), 656. Available at: https://doi.org/10.3390/educsci14060656 (Accessed 27 November 2025).
  • Tai, M. et al. (2024) ‘Impact of ChatGPT on ESL students’ academic writing skills’, Smart Learning Environments, 11, 2. Available at: https://slejournal.springeropen.com/articles/10.1186/s40561-024-00295-9 (Accessed 29 November 2025).
  • Xu, J. and Zhou, Y. (2023) ‘The impact of AI writing tools on the content and organisation of EFL student writing’, Cogent Education, 10(1), 2236469. Available at: https://www.tandfonline.com/doi/full/10.1080/2331186X.2023.2236469 (Accessed 26 November 2025).
  • Zawacki-Richter, O. and Marín, V.I. (2024) ‘ChatGPT in higher education: a synthesis of the literature and a bibliometric analysis’, Educational Technology Research and Development (online). Available at: https://link.springer.com/article/10.1007/s10639-024-12723-x (Accessed 30 November 2025).